Lazy Credal Classifier and how to compare credal classifiers
Author
Abstract
This poster makes two main contributions: (a) a lazy (or local) version of the naive credal classifier (NCC), which we call the lazy naive credal classifier (LNCC); (b) two metrics to compare credal classifiers. NCC [1] extends naive Bayes (NB) to imprecise probabilities by modeling prior ignorance via the Imprecise Dirichlet Model; the classification is eventually issued by returning the set of non-dominated classes. NCC therefore returns a set of classes when faced with instances whose classification would be prior-dependent for NB. Extensive experiments have shown that NCC is more reliable than NB. Yet NCC has two drawbacks: (i) the naive assumption (statistical independence of the features given the class) might be too simplistic, and (ii) in some cases NCC becomes too indeterminate. We address both issues by proposing LNCC. Working locally addresses point (i), because it reduces the chance of encountering strong dependencies. In addition, LNCC should also improve the determinacy of NCC, thus addressing (ii): by working locally, it selects the part of the learning set that is most informative about the instance to be classified. How do we select the number of instances used to train the local classifier? We keep including instances in the local learning set until NCC issues a determinate classification on the instance to classify (note that this clearly favors removing indeterminacy). The rationale behind this criterion is that we select a local learning set that is informative enough to draw a strong conclusion, such as a determinate classification. We investigate the effect of these choices through extensive experiments comparing LNCC with NCC. To compare LNCC and NCC, we propose (a) an indicator borrowed from multi-label classification and (b) a non-parametric rank test. To our knowledge, this is the first attempt to empirically compare credal classifiers.
Results on 36 data sets show that, according to both tests, LNCC clearly outperforms NCC, as it significantly reduces indeterminacy without worsening, and often improving, the overall accuracy.

Acknowledgments: work partially supported by the Swiss NSF grant n. 200021-118071/1.
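The instance-selection loop described in the abstract (grow the local learning set, nearest neighbors first, until NCC answers with a single class) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `ncc_classify`, the initial size `k0`, the growth `step`, and the Euclidean distance are all assumptions introduced here.

```python
import math

def euclidean(a, b):
    """Plain Euclidean distance between two numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def lncc_classify(train, query, ncc_classify, k0=25, step=25):
    """Hypothetical LNCC loop: train is a list of (features, label) pairs,
    ncc_classify(local_set, query) stands in for an NCC implementation and
    must return the set of non-dominated classes.  The local learning set
    grows until the answer is determinate (a single class) or data run out."""
    ranked = sorted(train, key=lambda ex: euclidean(ex[0], query))
    k = k0
    while True:
        classes = ncc_classify(ranked[:k], query)
        if len(classes) == 1 or k >= len(ranked):
            return classes
        k += step  # still indeterminate: include more neighbors
```

Note how the stopping rule itself favors determinacy: the loop only halts early on a single-class answer, which is exactly the bias the abstract acknowledges.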
Similar articles
Likelihood-Based Naive Credal Classifier
Bayesian classifiers learn a joint distribution P(C, F) and assign to the instance f̃ the most probable class label, argmax_{c′ ∈ C} P(c′, f̃). This defines a classifier, i.e., a map (F_1 × … × F_m) → C. Credal classifiers instead learn a joint credal set 𝒫(C, F) and return the set of optimal classes (e.g., according to maximality): {c′ ∈ C | ∄ c″ ∈ C, ∀P ∈ 𝒫 : P(c″ | f̃) > P(c′ | f̃)}. This defines a credal classifier, i.e., a map (F_1 × … × F_m) → 2^C. May return mo...
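The maximality criterion above can be made concrete for a credal set represented by finitely many posterior distributions: a class is dropped only if some other class has strictly higher posterior under every distribution in the set. The following sketch is an illustration of that rule; the dict-of-posteriors representation is an assumption made here for simplicity.

```python
def maximal_classes(posteriors):
    """Non-dominated classes under maximality.
    posteriors: list of dicts, one per distribution P in the credal set,
    each mapping a class c to its posterior P(c | f).
    A class c is dominated iff some c2 satisfies P(c2|f) > P(c|f) for all P."""
    classes = posteriors[0].keys()
    return {
        c for c in classes
        if not any(
            all(p[c2] > p[c] for p in posteriors)
            for c2 in classes if c2 != c
        )
    }
```

With a single distribution this reduces to the usual argmax; with several, classes whose ranking is distribution-dependent all survive, which is exactly how a credal classifier ends up returning a set.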
Credal Classification based on AODE and compression coefficients
Bayesian model averaging (BMA) is a common approach to average over alternative models; yet, it usually gets excessively concentrated around the single most probable model, therefore achieving only sub-optimal classification performance. The compression-based approach (Boullé, 2007) overcomes this problem; it averages over the different models by applying a logarithmic smoothing over the models...
Bayesian Networks with Imprecise Probabilities: Theory and Application to Classification
Bayesian networks are powerful probabilistic graphical models for modelling uncertainty. Among others, classification represents an important application: some of the most used classifiers are based on Bayesian networks. Bayesian networks are precise models: exact numeric values should be provided for quantification. This requirement is sometimes too narrow. Sets instead of single distributions ...
Credal Classification for Mining Environmental Data
Classifiers that aim at doing reliable predictions should rely on carefully elicited prior knowledge. Often this is not available so they should be able to start learning from data in condition of prior ignorance. This paper shows empirically, on an agricultural data set, that established methods of classification do not always adhere to this principle. Common ways to represent prior ignorance ...
Tree-Based Credal Networks for Classification
Bayesian networks are models for uncertain reasoning which are achieving a growing importance also for the data mining task of classification. Credal networks extend Bayesian nets to sets of distributions, or credal sets. This paper extends a state-of-the-art Bayesian net for classification, called tree-augmented naive Bayes classifier, to credal sets originated from probability intervals. This...
Publication date: 2009